variable-sized chunks with zarr v3 #10880
xarray/backends/zarr.py
Outdated
| # while dask chunks can be variable sized | ||
| # https://dask.pydata.org/en/latest/array-design.html#chunks | ||
| if var_chunks and not enc_chunks: | ||
| if zarr_format == 3: |
this check is probably not sufficient
I just pushed tags to my fork!
thanks, I've changed the example back to using your fork
xarray/backends/zarr.py
Outdated
| if any(len(set(chunks[:-1])) > 1 for chunks in var_chunks): | ||
| raise ValueError( | ||
| "Zarr requires uniform chunk sizes except for final chunk. " | ||
| "Zarr v2 requires uniform chunk sizes except for final chunk. " | ||
| f"Variable named {name!r} has incompatible dask chunks: {var_chunks!r}. " | ||
| "Consider rechunking using `chunk()`." | ||
| ) | ||
| if any((chunks[0] < chunks[-1]) for chunks in var_chunks): | ||
| raise ValueError( | ||
| "Final chunk of Zarr array must be the same size or smaller " | ||
| "Final chunk of a Zarr v2 array must be the same size or smaller " |
This is not correct - it's unfortunately not as simple as "Zarr V3 supports variable-length chunking but Zarr V2 doesn't".
@keewis - zarr-developers/zarr-python#3534 is approaching a mergeable state. Curious if you want to take another pass through this PR before we merge it and provide any feedback.
sure. Do you know if there's a specific tell on whether rectilinear chunks are available? So far I've been using I've posted a comment to the zarr PR (which doesn't seem to really affect the code here). Finally, I still need to figure out how to change |
| f"Variable named {name!r} has incompatible dask chunks: {var_chunks!r}. " | ||
| "Consider rechunking using `chunk()`." | ||
| "Consider rechunking using `chunk()`, or switching to the " | ||
| "zarr v3 format with zarr-python>=3.2." |
I still struggle with accurately expressing the prerequisites for rectilinear chunk support. Maybe this is fine, but we could also ask for "rectilinear chunk support"?
| "zarr v3 format with zarr-python>=3.2." | |
| "zarr v3 format with enabled rectilinear chunk support." |
| dask = { git = "https://github.com/dask/dask" } | ||
| distributed = { git = "https://github.com/dask/distributed" } | ||
| zarr = { git = "https://github.com/zarr-developers/zarr-python" } | ||
| zarr = { git = "https://github.com/jhamman/zarr-python", branch = "feature/rectilinear-chunk-grid" } |
revert before merging:
| zarr = { git = "https://github.com/jhamman/zarr-python", branch = "feature/rectilinear-chunk-grid" } | |
| zarr = { git = "https://github.com/zarr-developers/zarr-python" } |
whats-new.rst
Building on top of zarr-developers/zarr-python#3534, this is a draft PR that allows writing variable-sized chunks to zarr.
To see this in action, try:
At the moment, this requires safe_chunks=False because I haven't changed the chunk alignment machinery yet.
cc @d-v-b, @jhamman, @dcherian
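For readers following the review threads, the regular-chunk check quoted in the diff can be sketched as a standalone function. This is only an illustration and not the example the description refers to: the name check_regular_chunks is hypothetical, the error messages are abbreviated, and the sketch does not attempt to decide when rectilinear chunk support is actually available.

```python
def check_regular_chunks(name, var_chunks):
    """Raise ValueError unless every dimension has regular chunks.

    `var_chunks` is a dask-style tuple holding one tuple of chunk sizes
    per dimension, for example ((2, 2, 1), (3,)).
    `check_regular_chunks` is a hypothetical helper name, not xarray API.
    """
    # Along each dimension, all chunks except the last must share one size.
    if any(len(set(chunks[:-1])) > 1 for chunks in var_chunks):
        raise ValueError(
            "Uniform chunk sizes are required except for the final chunk. "
            f"Variable named {name!r} has incompatible dask chunks: {var_chunks!r}. "
            "Consider rechunking using `chunk()`."
        )
    # The final chunk may be smaller than the first one, but never larger.
    if any(chunks[0] < chunks[-1] for chunks in var_chunks):
        raise ValueError(
            "The final chunk must be the same size or smaller than the first. "
            f"Variable named {name!r} has incompatible dask chunks: {var_chunks!r}."
        )


# Regular chunks pass silently.
check_regular_chunks("x", ((2, 2, 1), (3,)))

# Variable-sized chunks are rejected by this check.
try:
    check_regular_chunks("x", ((1, 2, 3),))
except ValueError as err:
    print(err)
```

Chunks such as ((1, 2, 3),) are the kind of layout this PR aims to make writable, which is why the check cannot simply be applied unconditionally.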